An Improvement of HSMM-Based Speech Synthesis by Duration-Dependent State Transition Probabilities

Authors

Jing Tao

Wenju Liu

Abstract

In this paper, we propose an improvement to hidden semi-Markov model (HSMM) based speech synthesis using duration-dependent state transition probabilities. In the traditional HMM, the probability of a state's duration decreases exponentially with time, which does not adequately represent the temporal structure of speech. To overcome this limitation, the HSMM, which explicitly models the state duration distribution, was proposed. However, an inconsistency remains: although the HSMM has explicit state duration probability distributions, its state transition probabilities are duration-invariant. We therefore introduce duration-dependent state transition probabilities, which characterize the timescale distortion at a particular instant of an utterance more effectively, into the HSMM-based speech synthesis system. Correspondingly, we modify the forward-backward algorithm and re-derive the parameter re-estimation formulae. Experimental results show that the proposed method improves the naturalness of the synthesized speech.
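The contrast the abstract draws between implicit HMM durations and explicit HSMM durations can be illustrated with a small sketch. It is not the authors' method: the self-loop probability, the Gaussian mean and standard deviation, and the function names are all assumptions chosen for illustration. With a constant self-transition probability a, an HMM state lasts d frames with probability a^(d-1)(1-a), which always peaks at d = 1, whereas an explicit duration distribution (here a discretised single Gaussian) can peak at any duration.

```python
import math


def hmm_duration_pmf(self_loop, max_d):
    """Implicit HMM state duration: P(d) = a^(d-1) * (1 - a), d = 1..max_d."""
    return [self_loop ** (d - 1) * (1.0 - self_loop) for d in range(1, max_d + 1)]


def gaussian_duration_pmf(mean, std, max_d):
    """Explicit duration model: a Gaussian sampled at d = 1..max_d and renormalised."""
    weights = [math.exp(-0.5 * ((d - mean) / std) ** 2) for d in range(1, max_d + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only (assumptions, not taken from the paper).
hmm = hmm_duration_pmf(self_loop=0.8, max_d=20)
hsmm = gaussian_duration_pmf(mean=6.0, std=2.0, max_d=20)

# Most likely duration under each model (durations start at 1).
hmm_mode = 1 + hmm.index(max(hmm))
hsmm_mode = 1 + hsmm.index(max(hsmm))
print(hmm_mode, hsmm_mode)  # prints "1 6"
```

The geometric form makes a one-frame stay the most probable outcome regardless of a, which is the limitation that motivates explicit duration modelling.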


Similar resources

Explicit duration modelling in HMM-based speech synthesis using a hybrid hidden Markov model-multilayer perceptron

In HMM-based speech synthesis, it is important to correctly model duration because it has a significant effect on the perceptual quality of speech, such as rhythm. For this reason, hidden semi-Markov model (HSMM) is commonly used to explicitly model duration instead of using the implicit state duration model of HMM through its transition probabilities. The cost of using HSMM to improve duration...

Hidden semi-Markov model based speech synthesis

In the present paper, a hidden semi-Markov model (HSMM) based speech synthesis system is proposed. In a hidden Markov model (HMM) based speech synthesis system which we have proposed, rhythm and tempo are controlled by state duration probability distributions modeled by single Gaussian distributions. To synthesize speech, it constructs a sentence HMM corresponding to an arbitrarily given text an...

MLLR adaptation for hidden semi-Markov model based speech synthesis

This paper describes an extension of maximum likelihood linear regression (MLLR) to hidden semi-Markov model (HSMM) and presents an adaptation technique of phoneme/state duration for an HMM-based speech synthesis system using HSMMs. The HSMM-based MLLR technique can realize the simultaneous adaptation of output distributions and state duration distributions. We focus on describing mathematical ...

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

A Bayesian approach to Hidden Semi-Markov Model based speech synthesis

This paper proposes a Bayesian approach to hidden semi-Markov model (HSMM) based speech synthesis. Recently, hidden Markov model (HMM) based speech synthesis based on the Bayesian approach was proposed. The Bayesian approach is a statistical technique for estimating reliable predictive distributions by treating model parameters as random variables. In the Bayesian approach, all processes for con...

Publication date: 2009
